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University of South Australia and University of Maryland Baltimore County 

Abstract: In this paper wc provide a eomprehensive study of statistical infer- 
ence in linear and allied models which exhibit some analytic perturbations in 
their design and covariance matrices. We also indicate a few potential applica- 
tions. In the theory of perturbations of linear operators it has been known for a 
long time that the so-called "singular perturbations" can have a big impact on 
solutions of equations involving these operators even when their size is small. 
It appears that so far the question of whether such undesirable phenomena 
can also occur in statistical models and their solutions has not been formally 
studied. The models considered in this article arise in the context of nonlinear 
models where a single parameter accounts for the nonlinearity. 

1. Introduction 

The problem of estimating parameters and drawing suitable inference about them 
from noisy data that follow classical linear models has a long history in experimen- 
tal science and engineering. A standard assumption in such models is the normality 
and independence of the random error terms, and a complete and certain knowl- 
edge of the associated design matrices. Since there may be situations where such 
assumptions are not valid, or satisfactory, it is of interest to develop appropriate 
modifications of the usual standard solutions. In this regard, some robustness stud- 
ies have been reported in the literature. Kariya and Sinha [ I '_!] discuss distributional 
robustness in terms of deviations from normality of the error terms. They show that, 
under quite general conditions, the standard inference for the fixed effects under 
normality based on the F test continues to hold under a broad class of elliptically 
symmetric distributions. Calafiore and Ghaout [5] discuss some aspects of linear 
models in the presence of uncertainties in the mean and covariance matrix of the 
observed data, and study robust estimation of the underlying parameters. We also 
refer to Chandrasekaran et al. [(i], Ghaoui and Lebret [9] for some related work. 

Our goal in this paper is to study some aspects of analytic perturbations in 
statistical modeling and inference in the context of linear and allied models in the 
presence of certain "structured and systematic uncertainties" in the design and 
covariance matrices. With respect to these uncertainties in the design matrices in 
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linear models, our approach is different from the usual errors-in- variables models 
in that the uncertainties in the design matrices in our context are non-stochastic 
in nature and may be a result of small (but unknown) measurement errors that 
persistently underestimate or overestimate the true values of certain variables. Note 
that these kinds of deviations from true measurements are not random as they 
may be caused either by a persistent deficiency in the measurement sensors or 
even by consistent attempts at deception in a small (and hence hard to detect?) 
way. We propose the name systematic measurement bias to describe this class of 
uncertainties in the design and covariance matrices. The scenario considered here 
may also arise when we have nonlinear models where a single parameter accounts 
for the nonlinearity; in other words, the model is linear if this particular parameter 
is zero. 

Our ability to identify, quantify and analyze this kind of a bias stems from certain 
recent developments in the well established subject of analytic perturbation theory 
of matrices and operators. The reader is referred to the famous treatise by Kato 
[13] for a comprehensive treatment of this subject. However, in this study, results 
obtained by Avrachenkov [2], Avrachenkov and Haviv [3] and Avrachcnkov et al. 
[4] (see Section 2) provide us with tools to assess the impact of these systematic 
measurement biases on the quality of the estimates and the inferences drawn from 
these models. Note that in the theory of perturbations of operators it has been 
known for a long time that some small perturbations can have a big impact on 
solutions of equations involving these operators; these are the so-called "singular 
perturbations" . The question of whether such undesirable phenomena can also occur 
in statistical models and their solutions appears not to have been formally studied 
so far. Our application in the area of principal component analysis (PCA) involves 
perturbations in the estimated sample covariance matrix via perturbations in the 
underlying design matrices while the application in a factor analysis model deals 
with perturbations in the covariance matrix. 

The organization of the paper is as follows. Wc introduce some background of 
perturbation theory and give associated Laurent series and fundamental equations 
in Section 2. The context of linear models with perturbations in the design matrices 
is developed in Section 3, a core section of the paper. We derive the necessary expan- 
sions of the associated estimates of the parameters and their covariance matrices, 
and provide an outline of the inference on estimable linear parametric functions 
under perturbations of the underlying design matrices. Perturbations in principal 
component analysis appear in Section 4, and Section 5 deals with perturbations in 
the context of factor analysis models. Some applications are indicated in Section 6. 

2. Background: Laurent series and fundamental equations 

In this section we briefly review some results concerning analytic perturbations of 
matrices. We refer the reader to Avrachenkov [2] and Avrachenkov et al. [4] for 
details of proofs and derivations of these results. 

Let {Ak}k=o,i,--- — C"^" be a sequence of matrices that defines an analytic 
matrix valued function 

(2.1) A{e) =Ao+ eAi + A^ + • • • . 

The above series is assumed to converge in some non-empty neighbourhood of e = 0. 
We say that A{£) is an analytic perturbation of the matrix Aq = A{0), and that it 
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is a regular perturbation if is nonsingular, and a singular perturbation otherwise. 
Even though all results of this paper can be extended to the above general analytic 
perturbation, the statistical considerations are sufficiently well illustrated by the 
somewhat simpler case of a linear perturbation where (2.1) is replaced by 



(2.2) 



A{e) = A + eB, 



where = A, Ai ^ B, and Ak = 0, k > 2. Assume the inverse matrices A^^{e) 
exist in some (possibly punctured) disc centred at e = 0. Beyond linear pertur- 
bations, we may also have a quadratic or a polynomial perturbation though the 
contributions of the higher order terms will in general be small. Let us consider the 
following example of an analytically (linear) perturbed singular matrix 



Ais) = 
Its inverse is given by 

A-\e) 



1 - e 1 + £ 

1 - 2£ 1 - £ 





■ 1 


1 ■ 




■ -1 1 " 




1 


1 


+ £ 


-2 -1 



-£(1 - 3£) 



l-£ -l-£ 
-l + 2£ l-£ 



It should be clear that the above inverse admits a Laurent series expansion. To 
see this, we just expand det^(£))~^ = l/(— £(1 — 3£) as a scalar power series in 
£, multiply it by adjA(£) and equate coefficients with the same power of £. In this 
case, we have 



( 3 - 9£ 

£ 



1 -£ 
-l + 2£ 



-1 -£ 
1 -£ 



1 


■ -1 1 ■ 




■ -2 4 ■ 




■ -6 12 ■ 


£ 


1 -1 


+ 


1 -2 


+ £ 


3 -6 



A-\s) 



which is obviously a Laurent series. Clearly, the above is a very inefficient method 
of deriving this Laurent series. Fortunately, in the case of a linear perturbation, the 
following result that follows from the analysis in Avrachenkov [2] and Avrachenkov 
et al. [4] provides a much better method. 

If A~^{e) exists in a punctured neighborhood of £ = 0, then it can be expanded 
as a Laurent series 



(2.3) A-\z) 



-Y^i+Yo+eYi + -- 



where ^^^(£) is the singular component of A~^{e) and consists of the part of the 
series made of terms involving negative powers of £, and A^^{e) is the regular 
component consisting of all the remaining terms. The coefficients Yfc, k = — s, —s + 
1, . . . satisfy the following recursions 



(2.4) 



Yk+i = i-YoB)Yk, A; = 0,1,, 



(2.5) 



y-fc-i = (-F-iA)F-fc, k = l,...,s-l. 
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Thus the knowledge of the matrices A, B, Yq and F_i is sufficient to determine 
ah coefficients of tlie Laurent series expansion (2.3). 

There are now two cases to consider. In the case of a regular linear perturbation, 



A-s\e) 







. s, and Yq 



Ar, ■ However the 



singularly perturbed case is a little more complicated even when the perturbation 
is linear. Firstly, the order of the singularity, the index s in (2.3), which is also 
known as the pole of the expansion of A~^(e), needs to be determined. Secondly, 
the identity A{e)A~^(£) = I provides a set oi fundamental equations obtained by 
equating the coefficients of like powers of e that supplies the unic^uc set of coefficients 
of the expansion (2.3). Fortunately, only s + 1 of these are needed to determine 
Y_s, • • • YLi, Fo uniquely. 

In particular, for each t = 0, 1, ... let us define the following augmented matrix 



Ao 

Al Ao 

A2 Al Ao 

At At-^ ■ ■ ■ 



... 

... 

... 

Al Ao 



Of course, in the case of a linear perturbation the above augmented matrices have 
the form 



(*) _ 



A 








... 


B 


A 





... 





B 


A 


... 










B A 



A 



The determination of the order of the pole s in (2.3) can be achieved with the help 
of the following result that is proved in Avrachcnkov [■-^] and Avrachenkov et al. [4]. 

Theorem 2.1. The order of the pole s is given by the smallest value oft for which 
rank[^^*-'] = rank[^^*~"'^)] +n, where n is the dimension of A{e). 

Now if we define two block column matrices y :~ [Y^g, ■ ■ ■ , Y^i, Yq]"^ and J :~ 
[0, . . . , 0, 1]'^ and let s be the order of the pole of the expansion of A^^{e), then the 
coefficients of the non-positive powers of e in (2.3) are given as the unique solution 
of the linear system: 



(2.6) 



A^'^y = J. 



Next, we obtain a recursive formula for the Laurent series coefficients in (2.3). To 
eliminate negative indices in subscripts, it will be convenient to define Xk '.— ^fc-s, 
fc > and to introduce the Moorc-Pcnrose generalized inverse of the augmented 

matrix A^^^ In particular, let 5'^' [A^'^^Y' t>e the Moorc-Pcnrose generalized 



inverse of A^^^ and define the submatrices g|'^ £ (jnxn < i, j < t hy 



Gn 



G 



{■■<) 



G 



{■■<) 
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(s) 

where the dimensions and locations of G,-^ are in correspondence with the block 
structure of A^^"^ . 

In Avrachenkov et al. [4] it is shown that the first n rows of the generalized inverse 
G^^\ namely, [Gqq ■ ■ ■ Gq]] are all that is needed to calculate the coefficients of 
the expansion (2.3). Indeed, the following recursive formula provides the solution: 



(2.7) 



j=0 i=l 



is) 



initialising with Xq — Gq 

To illustrate these formulae we next use them to verify the first two coefficients 
in the expansion of A~^{e). First, we note that .4^'' = A, and 



At'^ 



A 
B A 



It is now easy to check that rank[^(^^] = 3 = rank[^'^°^] 
Theorem 2.1, the order of the pole is indeed ,s = 1 and 



1 + 2. Thus, by 
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-.5 


1 



It immediately follows from the notation introduced earlier and (2.7) that 



r_i — aq — — 



-1 1 
1 -1 



Now, using (2.7) again and the fact that Ak = for k > 2, we obtain 



Yo^Xi^ Gg' [h - BXo] + G'o,' [0 - OXo] , 



{i)r 



" 1 1 ■ 






■ 1 


■ 




■ -1 1 ■ 




■ -1 1 ■ 






■ -2 4 ■ 


-.5 -.5 









1 




-2 -1 




1 -1 






1 -2 



that is, 



It is, perhaps, worth commenting that while, at first sight, the dimension of A^'^^ 
may appear prohibitively large, it appears that the singularity of order s = 1 occurs 
very frequently (this is made precise in Avrachenkov [2]). 

3. Perturbed linear model 

We begin with the standard linear model 

(3-1) Y„xl = A'^(£)„xm/3mxl + €nxl 

and assume that the underlying design matrix X{e) admits an analytic perturbation 
expansion of the form 



(3.2) 



X{e) =Xo + eXi + 6^X2 + 6^X3 
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where Xq, Xi, X2, ■ ■ ■ are known component design matrices, and the error terms ^ ~ 
A^[0, cr^], which is the usual assumption in linear models. Here e > is referred to as 
a perturbation parameter which is typically small and also unknown. As mentioned 
above, our goal in this section is to study the effects of the perturbation parameter 
e on the statistical inference about f3. In view of the representation of X{£), the 
relevance of the discussion in Section 2 is obvious. 

Towards this end, we first note that for a given e. the maximum likelihood (also 
the least squares) estimate of /3, which is obtained by minimizing [Y ~ X'^ {e) f3]' [Y — 
X'^{e)j3] with respect to /3, is given by 

(3.3) m = [X{e)X^{e)]-'X{e)Y. 

Moreover, an estimate of the error variance cr^ is obtained from the residual sum 
of squares SSE{e) given by 

(3.4) SSEie) ^ Y^[I^~X^ie){X{e)X^ie)}~'X{e)Y 

= Y^P(e)Y 

where P{s), the usual projection operator, is given by 

(3.5) P(e) = [/„ - X^ie){Xis)X^ie)}-'Xis)]. 

To study the dependence of /3(e) and SSE{e) on e, let us recall from (3.2) that 
X{e) ^ Xo+ eXi + 6^X2 H , which yields 

(3.6) X{e)X^{e) = X^X^ + eiX^Xj + X^X^} 

+ e^ {XoXl + XiXj + X2X^] 
+ • • • 

= Bo + eBi + e^Ba + ■ • ■ 

where Bq = XqX^ and Bi is symmetric, Vi. We now distinguish between two cases 
depending on whether Bq is nonsingular or singular. 

Case 1: i?o is nonsingular. In this case which corresponds to a regular perturbation 
of ^(e), we readily get for small e 

(3.7) (X(e)X^(e))-i = (Bo + eSi + e'^a + • ■ • )"' 

= Co+eCi+e2C2 + --- 

where Co = B^^ ^ Cq"^ = i?o and the remaining C^'s can be computed following 
the ideas of Section 2. It then follows from (3.3) that 

(3.8) ^(e) = {Co+eCi+e^C2 + ---){Xo+eX^ + ---)Y 

^ CoXoY + e[CiXo + CoXi]Y 

+ e2[Mo + CiXi + CoX2]Y + --- 

= ^ + e[CiCo-i^ + CoXiY] 

+ e^[C2C^^h + CilXiY + CaX2Y] 

+ ■■■ 

= [Co + eCi+e2C2 + ---]Co-i^ 

+ e[CoXi Y] + e^ [CiX^Y + C0X2Y] + ■■■ 

« /3 + e [Ci Bq^ + C^XiY] for small e 
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where 

is the usual regression coefficient estimate without any perturbation parameter. 
Moreover, from (3.7), we get 

(3.9) X^{e)iXie)X^ie))-'Xie) 
^ {Xo + eXi + e^X^ + ■ ■ 

X (Co + eCi + e^Ca + • • • )(^o + e^i + • • • ) 
= XqCqXq + e[Xj CqXq + XqCiXq + XqC^Xi] 
+£^[^2"^o-'^o + -'^cfC'aXo + XqCqX2 + X'^CiXq 
+XlCiXi + x{CqXi] 

H 

w X^CoXq + e[Xf CqXo + X^CiXo + X'^CqXi] for small e 

which readily yields the following expansion of the projection operator 

(3.10) P{e) = I„ - X^{e){X{e)X^{e)}-'Xie) 

—e'^lXjCoXa + XjCa^o + XqCoX2 + XiCiXq 
+XqCiXi + X'^CqXi] 



so that, from (3.4), we get 

(3.11) SSE{e) = SSE-eY'^[XlCoXo+X^CiXo + X^CoXi]Y 

—e'^Y^ [xJCqXc + XjCa^o + X0C0X2 + X^CiXq 
+X^CiXi + X^CoXi] Y 

where SSE = Y'^(/„ — XqCoXo)Y is the standard error sum of squares without 
any perturbation parameter. The expansions in (3.8) and (3.11) clearly reveal the 
effects of the perturbation parameter e on the estimates of f3 and cr^. 

We now turn our attention to the problem of testing Hq : (3 ^ (3^ vs Hi : (3 ^ 
Under normality and independence of the error terms, a standard test in this context 
is the familiar F-test based on the F-statistic given by 

^ Cm - f3,r[X{e)X'^{e)me) ~ (3,)/m 
SSE{e)/{n~m) 

(3.12) 

-^111, n— 111 (Ho). 



To study the dependence of F{e) on e, we proceed as follows. From (3.8) and (3.11), 



24 



J. A. Filar et al. 



we get 



(3.13) (/3(e)) - l3,f[X{s)X^{e)me) - /3,) 

+e^{C2Xo + CiXi + CoX2)Y + ■ ■ 

[Bo+eBi+e^B2 + ---] 

[{^ - l3o) + e{C\Xo + CoXi)Y 

+e^{C2Xo + C\Xi + CoX2)Y + ■ ■ ■] 

+ e[iCiC^'^ + CoXiYfBo{^-f3o) 

+ 0-/3^fBi{^-f3„) 

+ (^ - f3„fBo{CiC^'^ + CoX^Y)] 
+ e^[2{C2Co'^ + CiXiY + CoX2YfBoi^ - /3o) 
+ 0-f3^fB20-f3„) 

+2{CiC^'^ + CoX.YfBii^ - /3o) 

+ iC2Co'^ + CiXiYBo{C2Co'^CiXiY + C0X2Y + C0X2Y) 



0-f3ofBo{f3-/3o) 
+e[2fc^'Bo{fi - f3o) + 2Y^ XjCfi - 13,) 



Hence, up to the first order approximation, we can express F{e) as F{e) = N{e)/ 
D{e) where 



N{e) = {(^-/3o)^So(/9-/3o)+£[2/9 C,^ B,{^ - (3,) + 2Y^ Xj - (3,) 
+ C^-l3,fB^{^-(3,)]}/m 



and 



, [SSE - eY^jXfCoXo + Xq^CiXq + X^CoX,)Y] 

(n — m) 



Perturbations and bias in statistical modeling 25 

The expression F{e) can be further expanded as 
(3.14) F{e) = ^-!l-pl[{^-(3,fBo0-po) 

[1 



SSE 

eY'^ {X'^CpXp + XqCiXq + XqCoXi)Y 
SSE ^ 



= Po + e[{^^)-2 ^ 

n-m 0-f3afBo0-f3^) 
^ m SSE 
Y^ {XiCqXq + Xf^CiXo + XjCoXi}Y 

where F^, the usual F statistic for testing Hq versus Hi, is defined as 

- /3o)^i?o(^ - /3o)/"^ 



(3.15) = 



SSE/{n - m) 



Lastly, we discuss the nature of the confidence set for /3 under the perturbed 
linear model (3.1). Obviously, under the assumption of normality and independence 
of errors, the (1 — a) level confidence set for f3 is given by the following. 

C{e) = {(3:{^{e)-(3f{X{e)X^{eyme)-f3) 

(3.16) < F„.™„-55i;(£)} 

n — m 

— ■ — ; m,n — ra\- 

The dependence of C(£) on e can then be studied based on the expansion of F{e) 
given in (3.14) , and we have up to the first order 

SSE/{n — m) 

n - m 2^'^B„CiBo0 -(3)+ Y^Xf (^ - /3) 
^' n ^ SSE 
i^^f3fBoi^-f3) Y^jXlCpXp + Xq^CiXq + X^CoXi)Y 
SSE ' SSE ^ 

Fa ; m,n — m}- 

Case 2: Bq = X^Xq is singular. We now turn our attention to the case when 
Bq is singular which is the singular perturbation situation and study the nature 
of dependence of the above estimates and test statistics on e. We define B{e) = 
X{e)X'^{e), and consider first the case when 

oo 

(3.18) B-'{e)X{s)^Y.''^k- 

k=0 
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The above condition will not always hold, and necessary and sufficient condi- 
tions on the basic design matrices Xq, Xi, . . . can be developed under which such a 
representation holds by equating X{e) to B{e) x J2'kLo^''^k- ^'^^ example, equat- 
ing the first two terms yields conditions on the existence of Cq and satisfying: 
Xo = (XoXo^)Co* and X, = {XoX^)C* + (X.Xj + XoXl)C^. 

Under the above representation (3.18), let j3 = CqY which is the limiting es- 
timate of /3 as £ approaches 0. It should be noted that /3 = Bq^XqY for some 
generalized inverse Bq of Bq = XqXq . To see this, note that under the represen- 
tation (3.18), 

CJO 

X{e) ^ B{e)B-\e)X{e) ^ X{e)X^{e){Y^ e^Cl). 
Taking the limit as e ^ 0, we get 

V V vT r> y^* 



Thus 



for some generalized inverse Bq^ of Bq. 

Returning to the error sum of squares SSE{e), recall that 

SSE{e) = Y^[/ - X^{e)B-\e)X{e)]Y 

and 

(3.19) P{e) = l^-X^{e)B-\e)X{e) 

OO 

= Y^e'Dl (say), 

fc=0 

where we have used the representation (3.18). We define lim^^o SSE{e) ~ DqY, 
where 

D*„ = P(0) = /„ - X^C^ = /„ - X;^B^Xo, 

which is always idcmpotent. Thus SSE = SSE{e = 0) = Y^[/„ - XoB^Xo]Y 
and because of the idempotency of Dq wc conclude that follows a non-central 
chisquarc distribution with non-ccntrality parameter l3'X{e)D^X'^{e)(3/a'^ and de- 
grees of freedom v given by 

ly = rank{P(£ = 0)} 
= n- Ta,nk{XoB^ Xo) 
= n ^ r, r = rank(Xo). 

Returning to the testing problem, let us define 

Cm-f3,r[Xie)X^ie)ms)-fB,]/m 



F(e) = 



SSE{e)/{n-m) 
C^-f3oVBoC^^f3o)/m 



SSE/{n - 
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We note that when e = 0, for testing Hq: X/3 = X/3q, the F statistic, say Fq, is 
given by 

° SSE/{n - r) 

where 

^ = B^XoY 

and r is the rank of Xq. Thus is a scalar multiple of Fq. 

When the representation (3.18) assumed earlier under Case 2 does not hold, 
we would actually have a Laurent series expansion of B~^{e) (as opposed to a 
Maclaurin series), resulting in 

oo 

(3.21) B-\s)X{e)= ^ e'C*,. 

k— — s 

In this case 

^(e) = B-\e)X{s)Y 

has no limit as e | 0, but 

E[^{e)] = B-^{e)X{e)X'^{e)(3 = (3, Ve 
=> /3{e) is an unbiased estimate of /3. 



Var[^(e)] ^ a^B-\e)X{e)X^{e)B~\e) 
= a^B-\e). 

Moreover, (3{e) is also unbiased as e and 

Var[^(e)] - a^B^XoX^B^ 
- <^ ^0 ■ 



Obviously, /9(e) and /3 are equivalent for small e if ^(e) ^ | q -Bq^- 

Remark 3.1. Interestingly enough, as proved in Avrachenkov [2], the projection 
matrix P{e) is uniformly bounded in e > 0, and it does admit a Maclaurin series 
expansion at e = 0, irrespective of the nature of B~^{e)X{e). Incidentally, the 
following simple example demonstrates that the projection matrix P{s) can be 
even independent of s. 

Example 3.1. Take 



Then 



2 \ 

2 

1 

1 Oj 



/O 1\ 

1 

1 

\o ij 



X^ie) 



I 2 e 
2 e 
1 e 
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One might imagine that the two cohimns of X'^{e) are generated by two sensors: 
the first one is accurate but the second one systematically overestimates the true 
reading of by the amount e. If 





1\ 


r 

2 


1 


1 


1 


V 1 


1 / 



then we have 



h - P{e) = {XX^ )-^X 



Thus P{e) is free of e. Hence, in this example, the systematic bias leaves the pro- 
jection matrix (and hence also the residual sum of squares) unchanged. 

Remark 3.2. Incidentally, an estimate of e can be provided by minimizing the error 
sum of squares, SSE{e), which is essentially the maximum likelihood estimate of e. 
Recalling from (3.11) the representation of SSE{e) and keeping terms up to e^, we 
can verify that SSE(e) is convex in e, and hence the minimizcr solution is given by 

(3.22) 

^ Y-^[Xj'"CoXo + XqCiXq + XjCoXi]Y 

2Y^[XjCoXo + XqC2Xq + XqCoX2 + XJCiXq + X^CiXi + Xj CqXi\Y 

When the perturbation parameter e is unknown, the point estimate e provides 
information about its magnitude. It is quite likely that the parameters of primary 
interest in a perturbed linear model are (3 and cr^. After obtaining e, the maximum 
likelihood estimate of ^ can be computed as /3(e). Similarly cr^ can be estimated. 



4. Perturbation in principal component analysis 

In this section we discuss some effects of perturbation in principal component anal- 
ysis (PC A). It is well known that in the standard PC A approach, one tries to reduce 
the dimension of a random p x 1 vector y with a mean vector and a dispersion ma- 
trix S by successively taking suitable orthogonal linear combinations of y such that 
the resultant linear combinations, known as principal components, have decreasing 
variances; see, for example, Anderson [;]. 

Generalizing the model (3.1) to a multivariate set up, we write 

(4.1) Y„xp = (e) 

where Y is a matrix of observations, the matrix X(e) is assumed to be of full column 
rank, and the rows of ip are assumed to be independently distributed with mean 
vector and a common dispersion matrix S. An estimate of the dispersion matrix 
S is then obtained as 

(4.2) E(e) = Y'^P(e)Y 

where, as before, P(e) is the projection matrix defined in (3.5). The sample principal 
components d^y, d^^y, . . . arc then obtained by solving the equations: 



(4.3) 



t{e)d{e) = \{e)d{e) 
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where A(e)'s are the eigenvalues of S(£). When e = 0, which means absence of any 
perturbation, the usual PCA applies. When e > 0, the sample eigenvalues and the 
eigenvectors will depend on the perturbation parameter s. There are quite a few 
papers dealing with the behaviors of such eigenvalues and eigenvectors as functions 
of e from a deterministic point of view. 

In our context, the statistical properties of the random eigenvalues and eigenvec- 
tors can be studied based on such behaviors. It turns out that while the behaviors 
of the eigenvectors are smooth, those of the eigenvalues are not! 

Following Remark 2.4, we can write P{e) as 

(4.4) P(e) = Po + ePi +e^P2 + --- 
which readily yields 

(4.5) S(£) ^S-o+e^i+e^^a + '- 
where Sq = SSE. The eigenvalues A(e) then satisfy the determinantal equation: 

(4.6) \So+eSi+e^S2 + A(e)/p| = 

and once the A(e)'s have been determined, the eigen vectors d(e)'s are obtained 
from the equations given in (4.3). 

In the special case when P{e) = Pq + ePi, the equation (4.6) reduces to 

(4.7) \So + eSi - X{e)Ip\ = 
which when further specialized to p — 2 yields 

Ai(£) + A2(£) = triSo) + etr{Si)Xi{e)X2{e) = \So + eSi\. 

Since (Ai(e) - A2(e))^ = (•^i(^) + •^2(£))^ — 4Ai(e)A2(£), up to the first order of 
approximation, we get 



[(Slll - S122)(S011 - S022) + 4soi2Sll2 

A 



(4.8) Ai(e)- A2(e) 
where 

(4.9) ^= [(soii-So22)'+4s2,2]i/2. 
In the above, we have used the notation 



5o = 



5011 S012 

5012 5022 



Si 



5111 S112 

5112 S122 



The equations (4.8) and (4.9) can be simultaneously solved to get first order 
approximations of Ai(£) and A2(e). Once this is done, first order approximations of 
the two eigenvectors di(e) ^ dio + edn and d2(e) ~ d2o + £d2i are obtained by 
solving the equations: 



(4.10) 



[5*0 + £5'i](dio + £dn) = Ai(£)(dio + £dn) 
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(4.11) [5*0 + £5i](d2o + £d2i) = A2(e)(d2o + ed2i) 

Writing Ai(e) ~ Aio+eAn, we readily get dio and dn as solutions of the equations: 

•Sodio = AiodioS'odii + S'ldio = Audio + Aiodn. 

Likewise, the components of the other eigenvector d2(£) can be obtained. 

It would indeed be challenging to study the statistical properties of the resultant 
eigenvectors and eigenvalues. 

5. Perturbation in factor analysis 

In this section we discuss some consequences of perturbation in the context of factor 
analysis (FA). 

Let Y = (Yi, 1^27 ... , Yp) be a vector of manifest variables. Then the FA model 
postulates that (Christensen [7] and Johnson and Wichern [10]) the manifest vari- 
ables are linear functions of some random latent variables plus a residual term. The 
functional relationship is typically expressed as 

(5.1) n = An/i+Ai2/2 + --- + AifeA. + ^i 

Y2 = A2l/l+A22/2H KA2fc/fc+V'2 

Yp = Apl/l+Ap2/2H 'r Xpkfk + i^p 

where /i, /2, . • . , /fc represent the random latent or common factors and (-01, tp2^ 
...,'ijjp) denote the residual or error terms. Here the elements A's arc known as 
factor loadings, the latent variables are assumed to have a normal distribution with 
mean vector and a dispersion matrix <i>, and the residuals are assumed to be 
independent of /'s, and independently normally distributed with mean and a 
variance-covariance matrix = diag(V'ii, ?/'22, . . . , V'pp)- The dispersion matrix E 
of Y is then given by 

(5.2) E = r$r^ + i'. 

The literature on FA usually assumes that the latent variables are orthogonal, 
resulting in $ = and hence the simplification E = FF-^ + 5*. 

Since E~Y ~ 0, the sample covariance matrix S can be used to estimate E, and 
the parameters in F and 5* are estimated by solving 

(5.3) S' = ff^ + *, 

subject to the condition that r-^(*I')^^r is diagonal. Iterative methods can be used 
to solve the above equations. 

The method of maximum likelihood, on the other hand, is based on maximizing 
the likelihood or its logarithm given by 

(5.4) C{r, ■^\S) - -[ln{det(E)} + tr(5E-i)] 

which is directly a function of factor loadings and the error variances in view of 
(5.2). An old but most successful algorithm due to Joreskog [11] which runs in two 
steps, first by maximizing C with respect to F for a fixed and then by maximizing 
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with respect to 5*, can be used to derive the maximum likehhood estimates. Details 
are omitted. 

Returning to the perturbation formulation, wc can introduce the perturbation 
parameter e in the form 

(5.5) *(£) = 4+£*i+e'*2 + --- 

which takes us away from the usual assumption of independence of the latent vari- 
ables and emphasizes that there may be some uncertainties in this assumption. We 
can then rewrite S as 

(5.6) E(£) = r$(e)r^ + 1' 

= r[/fc + £$i + £2$2 + ...]r^ + <i, 
= [rr^ + f] + £r$ir'^ + e^T^2T'^ + ■■■ 

which can be used to compute and expand E(£)~^ and also In |S(£)| in powers of 
£. The maximum likelihood estimates of the factor loadings (elements of F) and 
the variances (elements of VP) would then naturally depend on the perturbation 
parameter e, and it is indeed possible to study the effect of £ on these estimates 
along the same lines as in the previous sections. Details are omitted. 

6. Applications 

In this section wc provide an application of the theory developed in this paper. 
Consider the nonlinear regression of Y on X given by (Gallant [S], Chapter 1, 
Example 1) 

(6.1) Y, ^ 0ixu + 02X21 + 0oe'''''^ + e^, i=l,...,n 

where the Y^'s are the responses of a treatment- control design, xi = I and rep- 
resent, respectively, treatment and control scenarios, X2 represents a variable with 
values between 1 and 2, and X3 is the variable showing age of the experimental 
material. The errors e^'s are assumed to be (normally) distributed with mean and 
variance cr^. For e = 0, this of course is a simple linear regression of Y on {xi,X2) 
for which standard inference applies. For e > 0, this is an application of a non-linear 
regression set up for which drawing appropriate inference about the parameters is 
rather complex. An expansion of e'^^' readily yields the matrices Xq, Xi, X2, ■ ■ ■ 
given in (3.2), from which the relevant quantities /3(e), SSE{e), F{e) and C(e) can 
be computed up to any order of powers of e. 

The data given in Gallant [s], page 4, has all the X2i values equal to one. When 
this is the case, it is clear from (6.1) that 62 and 6*0 are not identifiable when e = 0. 
Consequently, in order to illustrate our methodology, we shall avoid the choice 
X2i = 1 for all i. In the data given in Table 1, the a:2i-values were randomly chosen 
from a uniform distribution on (1, 2). Also given in Table 1 are four data sets, 
simulated based on the model (6.1). The data sets share the same set of values of 
the covariates xi, X2 and X3, and differ in the values of the response variable Y. In 
each case, n = 30. 

To see the effect of e on the standard analysis, we note that for the given data 
sets, Xj : 30 X 3 is a matrix consisting of the first three columns from Table 1, 
Xj' : 30 X 3 is a matrix whose first column are the elements 2:3 and the remaining 
two columns are null vectors, and lastly, : 30 x 3 is a matrix whose first column 
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Table 1 
Data sets 



Xo 


XI 




X2 


X3 


^0 


Yi 


Y2 


1^3 


1 


1 


1 


.3420 


6.28 


6.92 


7.0451 


7.6532 


8.6632 


1 





1 


.5813 


9.86 


4.21 


4.4267 


5.4939 


7.5804 


1 


1 


1 


.1043 


9.11 


4.34 


4.5297 


5.4928 


7.3125 


1 





1 


.6867 


8.43 


3.81 


3.9869 


4.8595 


6.4576 


1 


1 


1 


.5164 


8.11 


3.99 


4.1633 


4.9944 


6.4946 


1 





1 


.5672 


1.82 


3.95 


3.9820 


4.1358 


4.3445 


1 


1 


1 


.9644 


6.58 


6.28 


6.4206 


7.0637 


8.1464 


1 





1 


.5411 


5.02 


5.63 


5.7309 


6.1986 


6.9320 


1 


1 


1 


.0064 


6.52 


4.05 


4.1857 


4.8217 


5.8897 


1 





1 


.8726 


3.75 


5.53 


5.6102 


5.9462 


6.4438 


1 


1 


1 


.0314 


9.86 


5.09 


5.3012 


6.3684 


8.4550 


1 





1 


.9190 


7.31 


6.16 


6.3080 


7.0388 


8.3106 


1 


1 


1 


.6507 


0.47 


5.80 


5.8132 


5.8514 


5.9001 


1 





1 


.7083 


0.07 


4.98 


4.9816 


4.9873 


4.9943 


1 


1 


1 


.1261 


4.07 


4.97 


5.0493 


5.4176 


5.9709 


1 





1 


.1693 


4.61 


5.38 


5.4791 


5.9032 


6.5561 


1 


1 


1 


.8063 


0.17 


7.19 


7.1955 


7.2092 


7.2264 


1 





1 


.7086 


6.99 


5.19 


5.3309 


6.0228 


7.2096 


1 


1 


1 


.4324 


4.39 


6.20 


6.2895 


6.6907 


7.3021 


1 





1 


.5265 


0.39 


5.14 


5.1441 


5.1757 


5.2158 


1 


1 


1 


.7009 


4.73 


4.37 


4.4670 


4.9038 


5.5798 


1 





1 


.5807 


9.42 


3.82 


4.0196 


5.0253 


6.9523 


1 


1 


1 


.5538 


8.90 


5.38 


5.5706 


6.5055 


8.2547 


1 





1 


.4150 


3.02 


3.12 


3.1812 


3.4459 


3.8250 


1 


1 


1 


.8566 


0.77 


5.06 


5.0730 


5.1360 


5.2176 


1 





1 


.5010 


3.31 


4.08 


4.1479 


4.4406 


4.8653 


1 


1 


1 


.1584 


4.51 


5.72 


5.8164 


6.2300 


6.8639 


1 





1 


.3310 


2.65 


3.89 


3.9459 


4.1756 


4.4991 


1 


1 


1 


.4981 


0.08 


6.12 


6.1169 


6.1234 


6.1314 


1 





1 


.0008 


6.11 


2.84 


2.9685 


3.5571 


4.5271 



elements arc (0.5) times the squares of and the remaining two columns are null 
vectors. Using these basic matrices, we readily compute 

30 15 44.8534 

Bo = XqXJ = I 15 15 21.7371 

44.8534 21.7371 69.3119 

1.1523 -0.1314 -0.7045 
Co = Bq^ = ( -0.1314 0.1372 0.0420 
-0.7045 0.0420 0.4571 



294.62 74.55 214.3236 
Bi = XiX^ + XoX[ = I 74.55 

214.3236 

-20.6598 1.3236 9.3912 

Ci = -CqBiCo = I 1.3236 -0.0332 -0.4397 

9.3912 -0.4397 -3.7610 

2035.2007 264.923 738.250 
B2 = X2X^ + XiXi' + XqX^ = I 264.923 

738.250 
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/ -164.4712 32.577 147.827 \ 
C2 ^ ''Bq\B2Co + BiCi) = [ 32.577 -4.3659 -22.4049 . 

\ 147.8266 -22.4049 -110.1712 / 

Hence, using (3.22), an estimate of e for the four data sets is obtained as eo = 
0.0505, ei = 0.0195, £2 = -0.0227, £3 = -0.1861. This results in the estimated 
regression coefBcients and the estimated i^-values (for testing the nuU hypothesis 
Hq : 60 = Oi = 62 = 1), up to the first order, as 



Estimated regression coefficients (standard errors) and F-values 



Data Set 


00 


01 


02 


F 


1 


2.0153 (0.9632) 


1.0642 (0.3329) 


1.6228 (0.6068) 


48.7884 


2 


2.2057 (0.9513) 


1.0629 (0.3287) 


1.5645 (0.5993) 


55.2427 


3 


3.0969 (0.9724) 


1.0545 (0.3360) 


1.2967 (0.6126) 


80.1393 


4 


4.6550 (1.3455) 


1.0537 (0.4650) 


0.8177 (0.8476) 


73.7195 



On the other hand, taking e = 0, the estimated regression coefficients and the 
i^-values, under the same null hypothesis Hq : 6q = 61 = 62 = I as above, are 
obtained in the four cases as 



Estimated regression coefficients (standard errors) and F-values 

when e ~ 



Data Set 


00 


01 


02 


F 


1 


2.0142 (0.9605) 


1.0640 (0.3327) 


1.6235 (0.6064) 


48.8080 


2 


2.2016 (0.9499) 


1.0618 (0.3283) 


1.5673 (0.5984) 


55.3330 


3 


3.0930 (0.9725) 


1.0549 (0.3362) 


1.2986(0.6126) 


80.0064 


4 


4.6511 (1.3450) 


1.0540 (0.4650) 


0.8199 (0.8472) 


73.7090 



It is interesting to note that there is practically no change in the estimated 
regression coefficients or the i^-values when e is zero or non-zero. Thus we conclude 
that for inference concerning the model (6.1), the non-linearity due to the presence 
of e is inconsequential; we might as well assume that e = 0, and carry out the usual 
linear model analysis. 



Acknowledgments. Our sincere thanks are due to Professor Edsel Pena for his 
constructive suggestions which led to improved presentation of results. 

References 

[1] Anderson, T. W. (1984). An Introduction to Multivariate Statistical Analy- 
sis, 2nd ed. Wiley, New York. MR0771294 

[2] AvRACHENKOV, K. E. (1999). Analytic perturbation theory and its applica- 
tions. Doctoral dissertation. University of South Australia. 

[3] AvRACHENKOV, K. E. AND Haviv, M. (2003). Perturbation of nuU spaces 
with application to the eigenvalue problem and generalized inverses. Linear 
Algebra Appl. 369 1-25. MR1988476 

[4] AvRACHENKOV, K. E., Haviv, M. AND HowLETT, P. G. (2001). Inversion 
of analytic matrix functions that are singular at the origin. SIAM J. Matrix 
Anal. Appl. 22 1175-1189. MR1824064 

[5] Calafiore, G. and Ghaout, L. E. (2001). Robust maximum hkelihood 
estimation in the linear model. Automatica 37 573-580. MR1832529 



34 J. A. Filar et al. 

[6] Chandrasekaran, G. G., Gu, M. and Sayed, A. H. (1998). Parameter 
estimation in the presence of bounded data uncertainties. SIAM J. Matrix 
Anal. Appl. 19 235-252. MR1609972 

[7] Christensen, R. (2001). Advanced Linear Modeling, 2nd cd. Springer, New 
York. MR1880721 

[8] Gallant, A. R. (1987). Nonlinear Statistical Models. Wiley, New York. 
MR0921029 

[9] Ghaoui, L. E. and Lebret, H. (1997). Robust solutions to least squares 
problems with uncertain data. SIAM J. Matrix Anal. Appl. 18 1035-1064. 
MR1472008 

[10] Johnson, R. A. and Wichern, D. W. (2002). Applied Multivariate Statis- 
tical Analysis, 5th ed. Prentice Hall, New York. 

[11] JORESKOG, K. G. (1967). Some contributions to maximum likelihood factor 
analysis. Psychometnka 32 443-482. MR0221659 

[12] Kariya, T. and Sinha, B. K. (1989). Robustness of Statistical Tests. Aca- 
demic Press, New York. MR0996634 

[13] Kato, T. (1980). Perturbation Theory for Linear Operators, 2nd ed. Springer, 
New York. 



